Conventional wisdom suggests that only the estimated intercept is affected by 
imposition of a zero censoring threshold on a Tobit model. This is true for 
Heckman-Lee estimation. For maximum likelihood (ML) estimation, however, it is 
only true if the censoring threshold is known and is subtracted from the dependent 
variable. Failure to properly transform the dependent variable prior to ML 
estimation of a zero threshold Tobit model will generally bias the coefficient 
estimates. A long neglected topic is ML estimation of a Tobit model with common, 
but unknown, censoring threshold. This paper shows that the ML estimator of the 
censoring threshold is the minimum order statistic from the observed subsample, and 
that existing software for estimation of a zero-threshold Tobit model is easily 
adapted to include estimation of the censoring threshold. 



I. INTRODUCTION 

Conventional wisdom suggests that only the estimated 
intercept is affected by imposition of a zero censoring 
threshold on a Tobit model.¹ This is true for Heckman-Lee 
(HL) estimation. For maximum likelihood (ML) estimation, 
however, it is only true if the censoring threshold is 
known and is subtracted from the dependent variable. 
Failure to properly transform the dependent variable prior 
to ML estimation of a zero threshold Tobit model will 
generally bias the coefficient estimates. A long neglected 
topic is ML estimation of a Tobit model with common, but 
unknown, censoring threshold. This paper shows that the 
ML estimator of the censoring threshold is the minimum 
order statistic from the observed subsample, and that 
existing software for estimation of a zero-threshold Tobit 
model is easily adapted to include estimation of the 
censoring threshold. 



II. THE TOBIT MODEL 

The model considered in this paper has been classified as a 
'Type I Tobit' model by Amemiya (1985). The form of the 
latent regression is: 

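$$Y_i^* = \alpha + X_i\beta + \sigma\varepsilon_i$$

where $\varepsilon_i \sim \text{i.i.d. } N(0, 1)$. The dependent variable $Y_i^*$ is not 
observed. Instead, a censoring indicator, $I_i$, is observed 
where 

$$I_i = 1 \quad \text{if } Y_i^* \ge \delta$$

and 

$$I_i = 0 \quad \text{if } Y_i^* < \delta$$

The parameter $\delta$ is a common censoring threshold.² The 
observed dependent variable, denoted $Y_i$, equals $Y_i^*$ if $I_i = 1$. 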



¹ Examples are numerous. Amemiya (1985, 363) denotes the censoring limit by $y_0$, and states that a zero censoring limit can be imposed 'without essentially 
changing the model, whether $y_0$ is known or unknown, because $y_0$ can be absorbed into the constant term of the regression'. Maddala (1983, 159) specifies 
the Tobit model with zero censoring limit in his Equation (6.1), and states that 'the Tobit model can be specified as in Equation (6.1) without loss of 
generality'. Finally, Greene (2000, 906) states that he will 'assume that the censoring point is zero, although this is only a convenient normalization'. 
² As presented here, an observation is censored if it is strictly less than $\delta$. In what follows, the only result that changes if an observation is censored when it 
is less than or equal to $\delta$ is that the maximum likelihood estimator of $\delta$ corresponds to the supremum of the likelihood function rather than the maximum. 
The value of $Y_i$ is missing (although commonly coded as 
zero) if $I_i = 0$. The vector of regressors, $X_i$, is observed 
regardless of any censoring of the dependent variable. 
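
The censoring rule can be made concrete with a minimal simulation sketch. The function name `simulate_tobit`, the parameter values, and the use of `None` for a missing value are illustrative assumptions, not part of the paper.

```python
import random

def simulate_tobit(n, alpha, beta, sigma, delta, seed=0):
    """Draw (X, Y, I) triples from a Type I Tobit model with threshold delta."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                    # regressor, always observed
        eps = rng.gauss(0.0, 1.0)                  # disturbance, N(0, 1)
        y_star = alpha + beta * x + sigma * eps    # latent dependent variable
        indicator = 1 if y_star >= delta else 0    # I = 1 if Y* >= delta
        y = y_star if indicator == 1 else None     # Y is missing when censored
        data.append((x, y, indicator))
    return data

sample = simulate_tobit(n=200, alpha=1.0, beta=0.5, sigma=1.0, delta=1.0)
observed = [y for _, y, i in sample if i == 1]
print(len(observed), min(observed))   # every observed Y is at least delta
```

Note that the smallest observed value can never fall below the threshold, which is the fact exploited later when the threshold is unknown.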

If $\delta$ is known, then the model may be expressed in 
equivalent form as a zero-threshold model. Subtracting $\delta$ 
from both sides of the latent regression gives 

$$y_i^* = \gamma + X_i\beta + \sigma\varepsilon_i$$

where $y_i^* = Y_i^* - \delta$ and $\gamma = \alpha - \delta$. The censoring indicator, $I_i$, 
is determined as 

$$I_i = 1 \quad \text{if } y_i^* \ge 0$$

and 

$$I_i = 0 \quad \text{if } y_i^* < 0$$

The observed dependent variable, $y_i$, equals $y_i^*$ if $I_i = 1$. 
The value of $y_i$ is missing if $I_i = 0$. 

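Heckman (1976) and Lee (1976) provide a particularly 
simple estimator for the Tobit model. In the first stage, a 
Probit model is estimated using only the qualitative information 
in the observed values of $I_i$. The log-likelihood 
function of the Probit model is 

$$\ln L(\alpha, \beta, \sigma, \delta) = \sum_{i=1}^{n} \left\{ I_i \ln\left[ 1 - \Phi\left( \frac{\delta - \alpha - X_i\beta}{\sigma} \right) \right] + (1 - I_i) \ln \Phi\left( \frac{\delta - \alpha - X_i\beta}{\sigma} \right) \right\}$$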

where $\Phi(\cdot)$ denotes the standard normal distribution function. 
Without quantitative information, only the standardized 
parameters $\lambda = (\alpha - \delta)/\sigma$ and $\pi = \beta/\sigma$ are identified. 
The identification conditions $\sigma = 1$ and $\delta = 0$ may be 
imposed on the Probit model without loss of generality.³ 
The first stage estimates of $\lambda$ and $\pi$ are then used to construct 
an auxiliary regressor based on the conditional expectation 
$E(\varepsilon_i \mid I_i = 1)$. In the second stage, ordinary least squares 
(OLS) is applied to the subsample regression function 

$$Y_i = \alpha + X_i\beta + \sigma \left[ \frac{\phi(-\lambda - X_i\pi)}{1 - \Phi(-\lambda - X_i\pi)} \right] + \eta_i$$



where $\phi(\cdot)$ denotes the standard normal density function. 
The zero threshold form of the subsample regression function 
is obtained by subtracting $\delta$ from both sides of the 
equation. Since the second stage regression is estimated by 
OLS, the only effect of subtracting $\delta$ from the dependent 
variable is to reduce the estimated intercept accordingly. 
The estimate of $\beta$ and the maximized value of the likelihood 
function from the second stage regression are unchanged. 
This is probably why $\delta = 0$ is often mistaken as an 
identification condition for the Tobit model. 
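
The invariance of the second stage can be checked numerically. The sketch below is an illustration under simplifying assumptions: the auxiliary regressor is built from the true $\lambda$ and $\pi$ rather than from first-stage Probit estimates, and the helper `ols` and all parameter values are hypothetical. It only demonstrates the OLS property that shifting the dependent variable by $\delta$ moves the intercept and nothing else.

```python
import random
from statistics import NormalDist

def ols(columns, y):
    """OLS coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(columns)
    # Augmented matrix [X'X | X'y], one row per regressor.
    a = [[sum(u * v for u, v in zip(columns[r], columns[c])) for c in range(k)]
         + [sum(u * v for u, v in zip(columns[r], y))]
         for r in range(k)]
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(a[r][p]))
        a[p], a[best] = a[best], a[p]              # partial pivoting
        for r in range(k):
            if r != p:
                factor = a[r][p] / a[p][p]
                a[r] = [vr - factor * vp for vr, vp in zip(a[r], a[p])]
    return [a[r][k] / a[r][r] for r in range(k)]

std = NormalDist()
rng = random.Random(1)
alpha, beta, sigma, delta = 2.0, 0.6, 0.8, 1.5
lam, pi = (alpha - delta) / sigma, beta / sigma    # standardized parameters

ones, xs, mills, ys = [], [], [], []
for _ in range(500):
    x = rng.gauss(0.0, 1.0)
    y_star = alpha + beta * x + sigma * rng.gauss(0.0, 1.0)
    if y_star >= delta:                            # observed subsample only
        c = -lam - pi * x
        ones.append(1.0)
        xs.append(x)
        ys.append(y_star)
        mills.append(std.pdf(c) / (1.0 - std.cdf(c)))   # auxiliary regressor

b_raw = ols([ones, xs, mills], ys)                         # dependent variable Y
b_shift = ols([ones, xs, mills], [y - delta for y in ys])  # dependent variable Y - delta
print(b_raw)
print(b_shift)   # same slopes; intercept lower by delta
```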



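Assuming that $\delta$ is known, the log-likelihood function of 
the Tobit model is 

$$\ln L(\alpha, \beta, \sigma) = \sum_{i=1}^{n} \left\{ I_i \left[ -\ln(\sigma) + \ln \phi\left( \frac{Y_i - \alpha - X_i\beta}{\sigma} \right) \right] + (1 - I_i) \ln \Phi\left( \frac{\delta - \alpha - X_i\beta}{\sigma} \right) \right\}$$

The zero threshold form of the log-likelihood function is 

$$\ln L(\gamma, \beta, \sigma) = \sum_{i=1}^{n} \left\{ I_i \left[ -\ln(\sigma) + \ln \phi\left( \frac{y_i - \gamma - X_i\beta}{\sigma} \right) \right] + (1 - I_i) \ln \Phi\left( \frac{-\gamma - X_i\beta}{\sigma} \right) \right\}$$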

It is important to note that the zero threshold log-likelihood 
function is expressed in terms of the transformed 
dependent variable, $Y_i - \delta$, not the original dependent 
variable, $Y_i$. If the dependent variable $y_i$ is properly 
constructed, then maximization of $\ln L(\gamma, \beta, \sigma)$ is equivalent to 
maximization of $\ln L(\alpha, \beta, \sigma)$. The estimates of $\beta$ and $\sigma$ are 
identical, and the estimates of $\gamma$ and $\alpha$ are related as 
$\alpha = \gamma + \delta$. When the censoring threshold is known, either model 
can be estimated, provided that the proper dependent 
variable is constructed. 
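
The equivalence of the two forms, and what goes wrong without the transformation, can be illustrated by evaluating the log-likelihood directly. This is a sketch under assumed parameter values; `tobit_loglik` is a hypothetical helper that evaluates the function at given parameters and performs no maximization.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()

def tobit_loglik(intercept, beta, sigma, threshold, data):
    """Tobit log-likelihood for data = [(x, y, indicator)]; y is ignored when censored."""
    total = 0.0
    for x, y, indicator in data:
        if indicator == 1:
            z = (y - intercept - beta * x) / sigma
            total += -math.log(sigma) + math.log(STD.pdf(z))
        else:
            total += math.log(STD.cdf((threshold - intercept - beta * x) / sigma))
    return total

rng = random.Random(42)
alpha, beta, sigma, delta = 2.5, 0.5, 1.0, 2.0
raw = []       # (x, Y, I): original dependent variable
shifted = []   # (x, y, I): transformed variable y = Y - delta
for _ in range(200):
    x = rng.gauss(0.0, 1.0)
    y_star = alpha + beta * x + sigma * rng.gauss(0.0, 1.0)
    ind = 1 if y_star >= delta else 0
    raw.append((x, y_star if ind else 0.0, ind))
    shifted.append((x, y_star - delta if ind else 0.0, ind))

ll_known = tobit_loglik(alpha, beta, sigma, delta, raw)           # ln L(alpha, beta, sigma)
ll_zero = tobit_loglik(alpha - delta, beta, sigma, 0.0, shifted)  # ln L(gamma, beta, sigma)
ll_wrong = tobit_loglik(alpha - delta, beta, sigma, 0.0, raw)     # zero threshold, untransformed Y
print(ll_known, ll_zero, ll_wrong)
```

The first two values agree up to rounding, while the third, which feeds the untransformed $Y_i$ into the zero threshold form, evaluates a different function of the parameters.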

This does not imply that the restriction $\delta = 0$ is an 
identification condition for the Tobit model. Failure to 
construct the transformed dependent variable, $y_i = Y_i - \delta$, 
when estimating a zero threshold Tobit model will constrain 
the maximized value of the likelihood function, and will 
result in biased coefficient estimates. The intuition behind 
this bias is simple. When expressed as a zero threshold 
model, the intercept of the regression model is reduced by $\delta$. 
If the dependent variable is not properly transformed, then 
this reduction in mean is not reflected in the empirical 
distribution of $Y_i$, and any decrease in the intercept 
degrades the fit of the observed subsample. Since the 
Tobit ML estimates are the solution to a set of simultaneous 
nonlinear implicit functions, the effects of a failure to 
properly transform the dependent variable will spill over to 
the estimates of $\beta$ and $\sigma$. In contrast, the HL estimator of $\beta$ 
is invariant to the choice of $\delta$, because it fails to impose the 
functional relationship between the parameters in the first 
and second stages of estimation. Of course, this is also the 
source of the inefficiency of the HL estimator. 

III. ML ESTIMATION OF THE CENSORING 
THRESHOLD 

This section considers the case where $\delta$ is unknown.⁴ The 
maximization problem is complicated by the fact that the 


³ In contrast, when the quantitative information in the observed subsample of $Y_i$ is included, the parameters $\alpha$, $\beta$, $\delta$, and $\sigma$ are all identified. 
⁴ To my knowledge, estimation of the Tobit model when $\delta$ is unknown has not been considered in the literature. Of course, there have been many 
generalizations of the Tobit model that relax the assumption of a common censoring threshold. These models are more appropriate in many applications 
than the standard Tobit model. Nevertheless, the standard Tobit model is still widely used. 
sample space of the observed dependent variable, $Y_i$, is a 
function of the unknown parameter $\delta$. The likelihood 
function above, $\ln L(\alpha, \beta, \sigma)$, is only valid for values of $\delta$ 
that keep all observed values of $Y_i$ in the sample space. 
Specifically, it is required that $\delta \le Y_i$ for all observations in 
the observed subsample. If $\delta > Y_i$ for any observation, there 
is a contradiction; the observed value of $Y_i$ should have 
been censored. This restriction may be simplified to $\delta \le Y_{(1)}$, 
where $Y_{(1)}$ denotes the minimum order statistic from the 
observed subsample. The minimum order statistic for the 
observed subsample provides an upper bound for the 
censoring threshold. 

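The log-likelihood function of the Tobit model may be 
generalized to include this restriction on the sample space 
through the use of an indicator function. Specifically, let 
$\tau(\delta) = 1$ if $\delta \le Y_{(1)}$ and $\tau(\delta) = 0$ if $\delta > Y_{(1)}$. If $\tau(\delta) = 0$, a value 
of $\delta$ has been chosen that is inconsistent with the observed 
data. Using this notation, the log-likelihood function may 
be written as 

$$\ln L(\alpha, \beta, \sigma, \delta) = \sum_{i=1}^{n} \left\{ I_i \left[ -\ln(\sigma) + \ln \phi\left( \frac{Y_i - \alpha - X_i\beta}{\sigma} \right) + \ln \tau(\delta) \right] + (1 - I_i) \ln \Phi\left( \frac{\delta - \alpha - X_i\beta}{\sigma} \right) \right\}$$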



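This generalization simply recognizes that the probability 
of $(Y_i, I_i)$ pairs outside the sample space is zero.⁵ 
Where it exists, the score equation for $\delta$ is 

$$\frac{\partial \ln L(\alpha, \beta, \sigma, \delta)}{\partial \delta} = \frac{1}{\sigma} \sum_{i=1}^{n} (1 - I_i) \, \frac{\phi\left( \frac{\delta - \alpha - X_i\beta}{\sigma} \right)}{\Phi\left( \frac{\delta - \alpha - X_i\beta}{\sigma} \right)}$$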



which is always positive. The log-likelihood function is 
increasing in $\delta$, for any $\delta < Y_{(1)}$. The log-likelihood function 
is discontinuous at $\delta = Y_{(1)}$, however. For any $\delta > Y_{(1)}$, the 
value of $\tau(\delta)$ is zero, and the log-likelihood function falls 
precipitously; essentially to minus infinity. The maximum 
of the likelihood function over $\delta$ occurs at $Y_{(1)}$. The ML 
estimator, $\hat{\delta}$, is the minimum order statistic from the 
observed subsample of $Y$.⁷ 
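
The shape of the likelihood in $\delta$ can be traced numerically. The following sketch holds $\alpha$, $\beta$, and $\sigma$ at assumed values and evaluates the log-likelihood on a grid of thresholds; the function name `loglik_with_threshold` and the simulated data are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()

def loglik_with_threshold(alpha, beta, sigma, delta, data):
    """Tobit log-likelihood including the sample-space indicator tau(delta)."""
    y_min = min(y for _, y, i in data if i == 1)
    if delta > y_min:                  # tau(delta) = 0, so ln tau(delta) = -infinity
        return -math.inf
    total = 0.0
    for x, y, i in data:
        if i == 1:
            total += -math.log(sigma) + math.log(STD.pdf((y - alpha - beta * x) / sigma))
        else:
            total += math.log(STD.cdf((delta - alpha - beta * x) / sigma))
    return total

rng = random.Random(7)
alpha, beta, sigma, true_delta = 1.0, 0.5, 1.0, 1.0
data = []
for _ in range(200):
    x = rng.gauss(0.0, 1.0)
    y_star = alpha + beta * x + sigma * rng.gauss(0.0, 1.0)
    ind = 1 if y_star >= true_delta else 0
    data.append((x, y_star if ind else 0.0, ind))

y_min = min(y for _, y, i in data if i == 1)          # minimum order statistic
grid = [y_min - 0.1 * k for k in range(10, -1, -1)]   # rises to exactly y_min
profile = [loglik_with_threshold(alpha, beta, sigma, d, data) for d in grid]
beyond = loglik_with_threshold(alpha, beta, sigma, y_min + 1e-6, data)
print(profile[0], profile[-1], beyond)
```

On the grid, the log-likelihood rises monotonically up to the minimum order statistic and drops to minus infinity just beyond it.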



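Substituting $\delta = Y_{(1)}$ into $\ln L(\alpha, \beta, \sigma, \delta)$ gives the 
concentrated log-likelihood function 

$$\ln L(\alpha, \beta, \sigma, \hat{\delta}) = \sum_{i=1}^{n} \left\{ I_i \left[ -\ln(\sigma) + \ln \phi\left( \frac{Y_i - \alpha - X_i\beta}{\sigma} \right) \right] + (1 - I_i) \ln \Phi\left( \frac{\hat{\delta} - \alpha - X_i\beta}{\sigma} \right) \right\}$$

Letting $y_i = Y_i - \hat{\delta}$ and $\gamma = \alpha - \hat{\delta}$, the concentrated 
log-likelihood function may be written as 

$$\ln L(\gamma, \beta, \sigma, \hat{\delta}) = \sum_{i=1}^{n} \left\{ I_i \left[ -\ln(\sigma) + \ln \phi\left( \frac{y_i - \gamma - X_i\beta}{\sigma} \right) \right] + (1 - I_i) \ln \Phi\left( \frac{-\gamma - X_i\beta}{\sigma} \right) \right\}$$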



This is just the log-likelihood function of the zero threshold 
Tobit model, where the ML estimator $\hat{\delta}$ is subtracted from 
the dependent variable. Consequently, software for estimation 
of a zero threshold Tobit model is easily adapted to 
the case of an unknown censoring threshold.⁸ Conventional 
test statistics for the Tobit model are valid conditional on $\hat{\delta}$. 
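
Constructing the dependent variable for zero threshold software can be sketched as follows. The helper `mos_transform` and its `shrink` argument are hypothetical; `shrink` is one possible way to pick a value slightly below the minimum order statistic so that software which infers censoring from the sign of the dependent variable keeps the smallest observation in the observed subsample.

```python
def mos_transform(y_values, indicators, shrink=1e-8):
    """Return (delta_hat, y) with y_i = Y_i - delta_hat for observed cases.

    delta_hat is set marginally below the minimum order statistic so that
    every observed y_i stays strictly positive. Censored cases are coded 0.
    """
    observed = [y for y, i in zip(y_values, indicators) if i == 1]
    y_min = min(observed)
    delta_hat = y_min - shrink * max(1.0, abs(y_min))
    transformed = [y - delta_hat if i == 1 else 0.0
                   for y, i in zip(y_values, indicators)]
    return delta_hat, transformed

# Illustrative data: censored entries carry a placeholder value that is never used.
Y = [3.2, 0.0, 2.5, 4.1, 0.0]
I = [1, 0, 1, 1, 0]
delta_hat, y = mos_transform(Y, I)
print(delta_hat, y)
```

The transformed series `y`, together with the indicator `I`, is what would be passed to a zero threshold Tobit routine; the intercept it reports then estimates $\gamma = \alpha - \hat{\delta}$.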



IV. DATA GENERATION 

The purpose of the Monte Carlo portion of this study is to 
examine how imposition of a zero censoring threshold 
affects the estimate of $\beta$ when the true censoring threshold 
is positive. In order to focus on this topic, the structure of 
the model is kept as simple as possible. The model contains 
an intercept, $\alpha$, and a single regressor, $X$, with coefficient, 
$\beta$. The regressor, $X$, is a random draw of independent 
standard normals, and is fixed in repeated sampling. The 
assumption of a zero mean and unit variance for the 
regressor involves no loss in generality. In practice, 
standardizing a regressor will simply scale the coefficient 
estimate without affecting the precision of the estimate 
or the fit of the model. The disturbance, $\varepsilon$, is a sequence 
of independent standard normals that are statistically 
independent of $X$. Given this structure, the unconditional 
mean of the latent dependent variable $Y^*$ is controlled by 
the single parameter $\alpha$. 



⁵ The sample space of the latent dependent variable, $y_i^*$, is the set of real numbers, while the sample space of the observed dependent variable, $y_i$, is the set 
of nonnegative real numbers. 

⁶ Current econometric software is written for the case of known $\delta$, and does not control for the discontinuity in the likelihood function induced by the 
restricted sample space of $Y$. For example, if the transformed dependent variable, $y_i = Y_i - \delta$, is constructed using an invalid value of $\delta$ (one exceeding $Y_{(1)}$), 
and zero threshold software is used to maximize $\ln L(\gamma, \beta, \sigma)$, incorrect values for the likelihood function will be reported without any error message or 
warning. Furthermore, for invalid $\delta$, the reported value of the log-likelihood function will differ from one software package to the next, depending on how 
the censored subsample is determined. (See footnote 8.) 

⁷ The problem is similar to that of finding the ML estimator of $\theta$ for a random sample of uniform random variables on the interval $(0, \theta)$. The ML 
estimator for this problem is the largest order statistic. 

⁸ A practical problem arises with software packages that determine the censored subsample from the numerical value of the dependent variable, rather 
than from a separate binary variable that indicates censoring. If observations for which $y_i \le 0$ are treated as censored, then the observation corresponding 
to $Y_{(1)}$ will be switched from the observed to the censored subsample after construction of $y_i$. The simplest solution to this problem is to choose as the 
estimate of $\delta$ a number 'slightly' smaller than $Y_{(1)}$. 
Two factors affecting the performance of Tobit estimators 
are the fit of the latent regression and the degree of 
censoring. Nelson (1984) shows that the variance of the 
Tobit estimator is affected proportionally by a change in $\sigma$. 
When comparing the relative performance of the estimator, 
the choice of $\sigma$ is arbitrary. Changes in $\sigma$ affect the absolute, 
but not the relative scale of the variances. If the 
normalization $\sigma = (1 - \beta^2)^{1/2}$ is adopted, then the fit of the latent 
regression is controlled by the single parameter, $\beta$. For this 
choice of $\sigma$, the slope coefficient, $\beta$, corresponds to the 
correlation coefficient between $Y^*$ and $X$. The parameters $\delta$ 
and $\alpha$ affect the probability of censoring only through the 
difference $\delta - \alpha$. Constructing $\alpha$ as $\alpha = \delta - \theta$, where $\theta$ is the 
critical value of the standard normal distribution function 
that gives the desired degree of censoring, results in 
equivalent variation in $\alpha$ and $\delta$ that leaves the probability of 
censoring unchanged. This allows an examination of the 
effect of an increase in $\delta$ on the performance of the zero 
threshold Tobit model, without confounding variation in 
the degree of censoring. 
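
The two normalizations can be verified in a few lines, treating $X$ as a standard normal draw independent of $\varepsilon$ (the unconditional view; for a fixed sample of $X$ the statements hold approximately):

$$\operatorname{Var}(Y^*) = \beta^2 \operatorname{Var}(X) + \sigma^2 = \beta^2 + (1 - \beta^2) = 1, \qquad \operatorname{Corr}(Y^*, X) = \frac{\operatorname{Cov}(Y^*, X)}{\sqrt{\operatorname{Var}(Y^*)\operatorname{Var}(X)}} = \beta$$

$$P(I = 0) = P(Y^* < \delta) = P(X\beta + \sigma\varepsilon < \delta - \alpha) = \Phi(\delta - \alpha) = \Phi(\theta)$$

For example, 50% censoring corresponds to $\theta = 0$, while 25% and 75% censoring correspond to $\theta \approx -0.674$ and $\theta \approx 0.674$.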

A brief summary of the Monte Carlo process is as follows: 

(1) The regressor, X, is drawn. It is fixed across 
repetitions. 

(2) The disturbance $\varepsilon$ is drawn. Given the regressor, $X$, 
and the parameters $\beta$, $\delta$, and $\theta$, the values of $Y$ and $I$ 
are computed. 

(3) Coefficient estimates are obtained for each of the 
estimators. 

(4) Steps 2 and 3 are repeated on successive repetitions, 
and sample moments for the estimators are compiled 
across repetitions. 
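
The steps above can be sketched as follows. This is an illustration rather than the original implementation: the seeds, the function names, the choice $\beta = \sqrt{0.5}$ (read here as 50% explanatory power), and Python's built-in normal generator are all assumptions. Step 3 is left as a placeholder because no particular estimator routine is assumed.

```python
import math
import random
from statistics import NormalDist

def draw_standard_normals(n, seed):
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def make_sample(x, eps, beta, delta, censor_share):
    """Build (Y, I) from the fixed regressor and one error draw."""
    sigma = math.sqrt(1.0 - beta ** 2)            # normalization: Corr(Y*, X) = beta
    theta = NormalDist().inv_cdf(censor_share)    # critical value for target censoring
    alpha = delta - theta                         # keeps delta - alpha equal to theta
    y, ind = [], []
    for xi, ei in zip(x, eps):
        y_star = alpha + beta * xi + sigma * ei
        observed = y_star >= delta
        ind.append(1 if observed else 0)
        y.append(y_star if observed else 0.0)     # censored values coded as zero
    return y, ind

n, reps = 100, 500
x = draw_standard_normals(n, seed=0)              # step 1: fixed across repetitions
# Step 2 errors, stored so the same draws serve every parameter combination.
errors = [draw_standard_normals(n, seed=1 + r) for r in range(reps)]

shares = []
for eps in errors:                                # step 4: loop over repetitions
    y, ind = make_sample(x, eps, beta=math.sqrt(0.5), delta=1.0, censor_share=0.5)
    # Step 3 would apply each estimator to (x, y, ind) here.
    shares.append(1.0 - sum(ind) / n)
print(sum(shares) / reps)                         # average censored share
```

Because the regressor is fixed, the realized censoring share fluctuates around the target rather than matching it exactly.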

The data generation process was carefully structured in 
order to limit intra-experiment random variation.⁹ Each 
estimator is applied to the same sequence of data sets for 
any given parameter combination, $(\beta, \delta, \theta)$. This will limit 
random variation in comparisons across estimators. 
In addition, the same sequence of independent standard 
normal errors, $\varepsilon$, will be used to construct the data sequence 
$(Y, I)$ required for each distinct parameter combination.¹⁰ 
This will limit random variation in comparisons across 
parameter values. 



V. RESULTS 

Zero threshold software may be used to obtain ML 
estimates of the Tobit model, whether $\delta$ is known or 
unknown. The essential step is construction of the proper 
dependent variable. If $\delta$ is known, the user must construct 
the transformed dependent variable, $y_i = Y_i - \delta$. This 
estimator will be referred to as the KT-MLE (known 
threshold). If $\delta$ is unknown, the user must construct 
$y_i = Y_i - Y_{(1)}$. This will be called the MOS-MLE (minimum 
order statistic). Use of the original dependent variable, $Y_i$, 
in conjunction with zero threshold software will be called 
the ZT-MLE (zero threshold). For nonzero $\delta$, this 
estimator is generally biased. 

Section II showed that the HL estimator of $\beta$ is invariant 
to the value of $\delta$. The MOS-MLE and KT-MLE are also 
invariant to variation in $\delta$, provided the degree of censoring 
is held constant. Recall that the degree of censoring is held 
constant in the face of an increase in $\delta$, by imposing a 
symmetric increase in $\alpha$ that leaves the difference $\delta - \alpha$ 
unchanged. Since an increase in $\alpha$ results in one-to-one 
increases in $Y_i$ and $Y_{(1)}$, simultaneous increases in $\alpha$ and $\delta$ 
will leave the transformed dependent variables, $Y_i - Y_{(1)}$ 
and $Y_i - \delta$, unchanged. Only the ZT-MLE will be affected 
by variation in $\delta$. 
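
The invariance follows directly from the construction $\alpha = \delta - \theta$. For an observation in the observed subsample,

$$Y_i - \delta = (\delta - \theta) + X_i\beta + \sigma\varepsilon_i - \delta = -\theta + X_i\beta + \sigma\varepsilon_i$$

which does not depend on $\delta$. The censoring rule $Y_i^* \ge \delta$ is equivalent to $-\theta + X_i\beta + \sigma\varepsilon_i \ge 0$, so the observed subsample is also the same for every $\delta$, and

$$Y_i - Y_{(1)} = (Y_i - \delta) - (Y_{(1)} - \delta)$$

is a difference of two quantities that are each free of $\delta$.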

The results of this section are based on 500 repetitions of 
samples of size 100. Figure 1 plots the mean bias of the 
estimate, as a percentage of the true $\beta$, against the true 
value of the censoring threshold, $\delta$. Figure 2 plots the root 
mean square error of the estimate, again as a percentage of 
the true $\beta$, against the censoring threshold. The mean 
values appear as a solid line, while a 95% asymptotic 
confidence interval is given by the surrounding dotted 
lines. Results are shown for censored subsamples of 25%, 
50% and 75%. In both figures, the explanatory power of 
the latent regression is 50%.¹¹ 
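
The summary measures can be computed from a list of estimates as in the sketch below. The function name `summarize` is hypothetical, and the confidence interval shown is a normal approximation for the mean bias, which is an assumption about how the plotted intervals were formed.

```python
import math

def summarize(estimates, true_beta):
    """Mean bias and RMSE of the estimates, as percentages of a positive true_beta."""
    r = len(estimates)
    errors = [b - true_beta for b in estimates]
    mean_bias = sum(errors) / r
    rmse = math.sqrt(sum(e * e for e in errors) / r)
    sd = math.sqrt(sum((e - mean_bias) ** 2 for e in errors) / (r - 1))
    half_width = 1.96 * sd / math.sqrt(r)         # normal-approximation 95% band
    scale = 100.0 / true_beta
    return {
        "bias_pct": mean_bias * scale,
        "bias_ci_pct": ((mean_bias - half_width) * scale,
                        (mean_bias + half_width) * scale),
        "rmse_pct": rmse * scale,
    }

# Four made-up estimates of a coefficient whose true value is 0.5.
result = summarize([0.45, 0.55, 0.50, 0.60], true_beta=0.5)
print(result)
```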

Figure 1 shows that the mean bias of the ZT-MLE is 
increasing in $\delta$ and increasing in the degree of censoring. 
The mean bias of the ZT-MLE is linear in $\delta$ for any given 
degree of censoring. For example, with a censored 
subsample of 50%, a one standard deviation increase in $\delta$ 
results in a 75% increase in mean bias.¹² As noted earlier, 
the MOS-MLE and HL estimates of $\beta$ are invariant to 
changes in $\delta$. Since the mean bias of both of these 
estimators is statistically zero, results are only reported 
for the case of censored subsamples of 25%. Finally, the 
mean bias of the KT-MLE (for all values of $\delta$) is given by 
the vertical intercept of the ZT-MLE. The mean bias of the 
KT-MLE is also statistically zero. 

Figure 2 reports the root mean square error (RMSE) of 
the estimators as a function of $\delta$. The RMSE of the ZT-MLE 
is increasing in $\delta$ and increasing in the degree of 
censoring. For $\delta$ greater than 0.5, the RMSE is essentially 
linear in $\delta$ for any given degree of censoring. For a censored 
subsample of 50%, a one standard deviation increase in $\delta$ 
results in approximately a 90% increase in RMSE. 



⁹ See Hendry (1984). 

¹⁰ The independent standard normals were obtained with the algorithm of Forsythe et al. (1977). 

¹¹ The results obtained when the explanatory power was increased to 75% or decreased to 25% were almost indistinguishable from those presented in 
Figs 1 and 2. The mean values of each specification generally fell within the envelope formed by the 95% confidence intervals. 
¹² Examination of the log-likelihood function shows that it is the size of $\delta$ relative to $\sigma$ that determines its empirical significance. 
Fig. 1. Mean bias (%) as a function of $\delta$ (n = 100, repetitions = 500, $R^2$ = 0.50) 

Fig. 2. Root mean square error (%) as a function of $\delta$ (n = 100, repetitions = 500, $R^2$ = 0.50) 
Results for the HL and MOS-MLE are only reported for 
censored subsamples of 75%. This figure represents a 
'worst case' scenario for the HL estimator. As would be 
expected, the MOS-MLE has an MSE advantage over the HL 
estimator. As the degree of censoring is reduced, the 
RMSE of both of these estimators falls. The size of the 
MSE advantage diminishes as the degree of censoring is 
reduced. 

Despite its bias, the ZT-MLE has an MSE advantage over 
the HL estimator for sufficiently small $\delta$. With censored 
subsamples of 75%, the RMSE of the ZT-MLE is lower 
than that of the HL estimator for $\delta$ less than about 1.1 
standard deviations. As the degree of censoring falls, the 
RMSE of the HL estimator falls. For censored subsamples 
of 25% (not shown), the RMSE of the ZT-MLE is lower 
than that of the HL estimator for $\delta$ less than about 0.75 
standard deviations. This is just a reflection of the relative 
inefficiency of the HL estimator. 

Finally, the RMSE of the KT-MLE (for all values of $\delta$) is 
given by the vertical intercept of the ZT-MLE. The RMSE 
of the KT-MLE and MOS-MLE are virtually identical for 
like degrees of censoring. Since imposing a valid constraint 
can only improve the precision of the estimates, the KT-MLE 
is expected to have an MSE advantage over the MOS-MLE. 
The results of Fig. 2 show that this MSE advantage 
is trivial for the sample sizes considered here. Even for 
observed subsamples averaging 25 observations (where 
75% of 100 observations are censored), the precision of the 
MOS estimator of $\delta$ is sufficient to yield almost 
indistinguishable results. 

While the ZT-MLE of $\beta$ is biased upwards, its 
corresponding t-statistic is biased downward. Figure 3 
depicts the relationship between the censoring threshold, $\delta$, 
and the mean value of the t-statistic obtained under the 
null hypothesis $\beta = 0$, for different degrees of censoring and 
explanatory power. First, the degree of censoring is fixed at 
75%, and the degree of explanatory power, represented by 
$R^2$, is decreased from 75%, to 50%, to 25%. In each case, 
the mean value of the t-statistic is decreasing in $\delta$, and this 
relationship is shifted downward by a decrease in 
explanatory power. For sufficiently large values of $\delta$ and 
sufficiently small values of $R^2$, the outcome of the test is 
altered by use of the ZT-MLE. 

Figure 3 also illustrates the effects of variation in the 
degree of censoring. The degree of explanatory power is 
fixed at 25%, and the degree of censoring is increased from 
25%, to 50%, to 75%. The mean value of the t-statistic is 
again decreasing in $\delta$, and the relationship is shifted 
downward by an increase in the degree of censoring. For 
sufficiently large values of $\delta$ and a sufficiently high degree 
of censoring, the outcome of the test is altered by use of the 
ZT-MLE. 

VI. CONCLUSION 

This paper shows that proper treatment of the censoring 
threshold when estimating a Tobit model is no trivial 
matter. If the censoring threshold is known, it must be 
subtracted from the dependent variable prior to ML 
estimation of a zero-threshold model. Failure to do so 
will result in an upward bias in the coefficient estimates, 
and a downward bias in conventional t-statistics. The ML 
estimate of a common censoring threshold is shown to be 
the minimum order statistic from the observed subsample. 
Existing software for estimation of the zero-threshold 
Tobit model is easily adapted to this case, by simply 
subtracting the ML estimate of the censoring threshold 
from the dependent variable prior to use. The MSE of this 
generalized ML estimator is found to rival that of the 
conventional ML estimator (where the censoring threshold 
is known), even for relatively small observed subsamples. 
Whether the censoring threshold is known or unknown, 
failure to properly transform the dependent variable 
mis-specifies the Tobit model and results in a substantial 
increase in MSE. 